Linearly convergent stochastic heavy ball method for minimizing generalization error
Authors
Abstract
In this work we establish the first linear convergence result for the stochastic heavy ball method. The method performs SGD steps with a fixed stepsize, amended by a heavy ball momentum term. In the analysis, we focus on minimizing the expected loss and not on finite-sum minimization, which is typically a much harder problem. While in the analysis we constrain ourselves to quadratic loss, the overall objective is not necessarily strongly convex.
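The update rule described in the abstract combines a plain SGD step with a heavy ball momentum term. As a rough illustration on the quadratic (least-squares) setting the abstract mentions, here is a minimal Python sketch of such an update; the stepsize gamma and momentum beta below are illustrative assumptions, not the constants from the paper's analysis.

```python
import numpy as np

def stochastic_heavy_ball(A, b, gamma=0.01, beta=0.9, iters=1000, seed=0):
    """One possible SHB loop for the least-squares loss f(x) = (1/2n) * ||Ax - b||^2."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x_prev = np.zeros(d)
    x = np.zeros(d)
    for _ in range(iters):
        i = rng.integers(n)                                 # sample one data point uniformly
        grad_i = (A[i] @ x - b[i]) * A[i]                   # stochastic gradient of (1/2)(a_i^T x - b_i)^2
        x_next = x - gamma * grad_i + beta * (x - x_prev)   # SGD step plus heavy ball momentum
        x_prev, x = x, x_next
    return x
```

For a suitable fixed stepsize and momentum parameter, the paper's result is linear convergence in expectation on such quadratics; the values used above are placeholders, not those tuned constants.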
Similar References
Stochastic Optimization with Variance Reduction for Infinite Datasets with Finite Sum Structure
Stochastic optimization algorithms with variance reduction have proven successful for minimizing large finite sums of functions. However, in the context of empirical risk minimization, it is often helpful to augment the training set by considering random perturbations of input examples. In this case, the objective is no longer a finite sum, and the main candidate for optimization is the stochas...
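As a toy illustration of this setting (not the method proposed in that paper), the sketch below runs plain SGD on a least-squares loss where each input is perturbed afresh at every step, so the effective objective is an expectation over perturbations rather than a finite sum. The additive Gaussian perturbation and all parameter values are assumptions for illustration.

```python
import numpy as np

def sgd_with_augmentation(A, b, gamma=0.01, noise=0.1, iters=1000, seed=0):
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(iters):
        i = rng.integers(n)
        a_pert = A[i] + noise * rng.standard_normal(d)   # fresh perturbation each step: an expectation, not a finite sum
        x -= gamma * (a_pert @ x - b[i]) * a_pert        # plain SGD step on the perturbed example
    return x
```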
Non-Uniform Stochastic Average Gradient Method for Training Conditional Random Fields
We apply stochastic average gradient (SAG) algorithms for training conditional random fields (CRFs). We describe a practical implementation that uses structure in the CRF gradient to reduce the memory requirement of this linearly-convergent stochastic gradient method, propose a non-uniform sampling scheme that substantially improves practical performance, and analyze the rate of convergence of ...
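Non-uniform sampling in this spirit can be sketched generically (this is importance-sampled SGD, not the paper's SAG implementation): draw example i with probability p_i proportional to an estimate L_i of its gradient's Lipschitz constant, and reweight by 1/(n * p_i) so the stochastic gradient stays unbiased. The choice p_i proportional to L_i is an assumed heuristic for illustration.

```python
import numpy as np

def nonuniform_sgd_step(x, grad_fns, L, gamma, rng):
    """One importance-sampled SGD step: sample i with p_i proportional to L_i."""
    L = np.asarray(L, dtype=float)
    n = len(L)
    p = L / L.sum()                   # sampling probabilities from per-example constants L_i
    i = rng.choice(n, p=p)
    g = grad_fns[i](x) / (n * p[i])   # reweighting keeps the gradient estimate unbiased
    return x - gamma * g
```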
Linear Convergence of Accelerated Stochastic Gradient Descent for Nonconvex Nonsmooth Optimization
In this paper, we study the stochastic gradient descent (SGD) method for nonconvex nonsmooth optimization, and propose an accelerated SGD method by combining the variance reduction technique with Nesterov’s extrapolation technique. Moreover, based on the local error bound condition, we establish the linear convergence of our method to obtain a stationary point of the nonconvex optimization....
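One generic way to combine these two ingredients looks roughly as follows. This is a sketch under assumptions (fixed extrapolation weight theta, epoch length m, stepsize gamma), not the exact scheme analyzed in that paper.

```python
import numpy as np

def svrg_nesterov(grad_i, full_grad, x0, n, gamma=0.01, theta=0.5, epochs=10, m=100, seed=0):
    rng = np.random.default_rng(seed)
    x = x0.copy()
    y = x0.copy()
    for _ in range(epochs):
        snapshot = x.copy()
        mu = full_grad(snapshot)                         # full gradient at the epoch snapshot
        for _ in range(m):
            i = rng.integers(n)
            g = grad_i(i, y) - grad_i(i, snapshot) + mu  # SVRG-style variance-reduced gradient at y
            x_new = y - gamma * g                        # gradient step from the extrapolated point
            y = x_new + theta * (x_new - x)              # Nesterov-style extrapolation
            x = x_new
    return x
```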
A New Ridge Estimator in Linear Measurement Error Model with Stochastic Linear Restrictions
In this paper, we propose a new ridge-type estimator called the new mixed ridge estimator (NMRE) by unifying the sample and prior information in the linear measurement error model with additional stochastic linear restrictions. The new estimator is a generalization of the mixed estimator (ME) and ridge estimator (RE). The performances of this new estimator and the mixed ridge estimator (MRE) against th...
Stability and Convergence Trade-off of Iterative Optimization Algorithms
The overall performance or expected excess risk of an iterative machine learning algorithm can be decomposed into training error and generalization error. While the former is controlled by its convergence analysis, the latter can be tightly handled by algorithmic stability (Bousquet and Elisseeff, 2002). The machine learning community has a rich history investigating convergence and stability s...
Journal: CoRR
Volume: abs/1710.10737
Publication year: 2017